13 research outputs found

    A Retrospective Analysis of the Fake News Challenge Stance Detection Task

    The 2017 Fake News Challenge Stage 1 (FNC-1) shared task addressed a stance classification task as a crucial first step towards detecting fake news. To date, there is no in-depth analysis paper that critically discusses FNC-1's experimental setup, reproduces the results, and draws conclusions for next-generation stance classification methods. In this paper, we provide such an in-depth analysis for the three top-performing systems. We first find that FNC-1's proposed evaluation metric favors the majority class, which can be easily classified, and thus overestimates the true discriminative power of the methods. Therefore, we propose a new F1-based metric that yields a changed system ranking. Next, we compare the features and architectures used, which leads to a novel feature-rich stacked LSTM model that performs on par with the best systems but is superior in predicting minority classes. To understand the methods' ability to generalize, we derive a new dataset and perform both in-domain and cross-domain experiments. Our qualitative and quantitative study helps interpret the original FNC-1 scores and understand which features improve performance and why. Our new dataset and all source code used during the reproduction study are publicly available for future research.
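    The majority-class bias described in this abstract can be made concrete with a small calculation. The sketch below contrasts the hierarchical FNC-1 weighting (0.25 points for a correct related/unrelated decision, plus 0.75 for the exact label of a related pair, normalized by the maximum attainable score) with a macro-averaged F1. The toy label distribution only roughly mimics FNC-1's skew, and the "perfect on unrelated, blind discuss otherwise" classifier is a hypothetical illustration, not one of the analyzed systems.

```python
LABELS = ["agree", "disagree", "discuss", "unrelated"]

def fnc_relative_score(gold, pred):
    """FNC-1-style weighted score, normalized by the best achievable score."""
    score = best = 0.0
    for g, p in zip(gold, pred):
        best += 0.25 + (0.75 if g != "unrelated" else 0.0)
        if (g != "unrelated") == (p != "unrelated"):
            score += 0.25          # related vs. unrelated decided correctly
        if g != "unrelated" and g == p:
            score += 0.75          # exact related label also correct
    return score / best

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 -- every class counts equally."""
    f1s = []
    for label in LABELS:
        tp = sum(g == p == label for g, p in zip(gold, pred))
        fp = sum(p == label and g != label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy distribution, roughly as skewed as FNC-1 (~73% "unrelated").
gold = ["unrelated"] * 73 + ["discuss"] * 18 + ["agree"] * 7 + ["disagree"] * 2
# A weak classifier: perfect on "unrelated", blind "discuss" for the rest.
pred = ["unrelated"] * 73 + ["discuss"] * 27

print(round(fnc_relative_score(gold, pred), 3))  # ~0.851: looks strong
print(round(macro_f1(gold, pred), 3))            # 0.45: reveals the weakness
```

    A classifier with zero skill on the minority classes still collects a high FNC-1 score, while the macro F1 exposes the gap; this is the effect motivating the proposed F1-based metric.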

    A Machine-Learning-Based Pipeline Approach to Automated Fact-Checking

    In the past couple of years, there has been a significant increase in the amount of false information on the web. These falsehoods spread quickly through social networks, reaching a wider audience than ever before. This poses new challenges to our society, as we have to reevaluate which information sources we should trust and how we consume and distribute content on the web. In response to the rising amount of disinformation on the Internet, the number of fact-checking platforms has increased. On these platforms, professional fact-checkers validate published information and make their conclusions publicly available. Nevertheless, the manual validation of information by fact-checkers is laborious and time-consuming, and as a result, not all published content can be validated. Since the conclusions of the validations are released with a delay, interest in the topic has often already declined, and thus only a small fraction of the original news consumers can be reached. Automated fact-checking holds the promise of addressing these drawbacks, as it would allow fact-checkers to identify and eliminate false information as it appears on the web and before it reaches a wide audience. However, despite significant progress in the field of automated fact-checking, substantial challenges remain: (i) The datasets available for training machine-learning-based fact-checking systems do not provide high-quality annotation of real fact-checking instances for all the tasks in the fact-checking process. (ii) Many of today's fact-checking systems are based on knowledge bases that have low coverage. Moreover, because these systems require natural-language sentences to be transformed into formal queries, which is a difficult task, they are error-prone.
(iii) Current end-to-end trained machine learning systems can process raw text and thus potentially harness the vast amount of knowledge on the Internet, but they are opaque and do not reach the desired performance. In fact, fact-checking is a challenging task, and today's machine learning approaches are not mature enough to solve the problem without human assistance. In order to tackle the identified challenges, in this thesis, we make the following contributions: (1) We introduce a new corpus based on the Snopes fact-checking website that contains real fact-checking instances and provides high-quality annotations for the different sub-tasks in the fact-checking process. In addition to the corpus, we release our corpus creation methodology, which allows for efficiently creating large datasets with high inter-annotator agreement in order to train machine learning models for automated fact-checking. (2) In order to address the drawbacks of current automated fact-checking systems, we propose a pipeline approach that consists of four sub-systems: document retrieval, stance detection, evidence extraction, and claim validation. Since today's machine learning models are not advanced enough to complete the task without human assistance, our pipeline approach is designed to help fact-checkers speed up the fact-checking process rather than take over the job entirely. Our pipeline is able to process raw text and thus make use of the large amount of textual information available on the web, but at the same time, it is transparent, as the outputs of the pipeline's sub-components can be observed. Thus, the different parts of the fact-checking process are automated, and potential errors can be identified and traced back to their origin. (3) In order to assess the performance of the developed system, we evaluate the sub-components of the pipeline in highly competitive shared tasks.
The stance detection component of the system is evaluated in the Fake News Challenge, reaching the second rank out of 50 competing systems. The document retrieval component, together with the evidence extraction sub-system and the claim validation component, is evaluated in the FEVER shared task. The first two systems combined reach the first rank in the FEVER shared task Sentence Ranking sub-task, outperforming 23 other competing systems. The claim validation component reaches the third rank in the FEVER Recognizing Textual Entailment sub-task. (4) We evaluate our pipeline system, as well as other promising machine learning models for automated fact-checking, on our newly constructed Snopes fact-checking corpus. The results show that even though the systems are able to reach reasonable performance on other datasets, they underperform on our newly created corpus. Our analysis reveals that the more realistic fact-checking problem setting defined by our corpus is more challenging than the problem setting posed by other fact-checking corpora. We therefore conclude that further research is required in order to increase the performance of automated systems in real fact-checking scenarios.
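    The four-stage design described in contribution (2) can be sketched as a thin driver that keeps every intermediate output inspectable. All function bodies below are placeholder stand-ins (the real sub-systems are learned models), so the names and return values are illustrative assumptions, not the thesis implementation.

```python
from dataclasses import dataclass, field

@dataclass
class FactCheckResult:
    """Carries every intermediate output so errors stay traceable."""
    claim: str
    documents: list = field(default_factory=list)
    stances: list = field(default_factory=list)   # (document, stance) pairs
    evidence: list = field(default_factory=list)  # extracted sentences
    verdict: str = "not enough info"

def retrieve_documents(claim):
    # Stand-in: the real component queries a search index / entity linker.
    return [f"retrieved document mentioning: {claim}"]

def detect_stance(claim, document):
    # Stand-in: a classifier over {agree, disagree, discuss, unrelated}.
    return "discuss"

def extract_evidence(claim, document):
    # Stand-in: rank the document's sentences by relevance to the claim.
    return [document]

def validate_claim(claim, evidence):
    # Stand-in: a textual-entailment model over the selected evidence.
    return "supported" if evidence else "not enough info"

def fact_check(claim):
    """Run retrieval -> stance detection -> evidence extraction -> validation."""
    result = FactCheckResult(claim=claim)
    result.documents = retrieve_documents(claim)
    for doc in result.documents:
        result.stances.append((doc, detect_stance(claim, doc)))
        result.evidence.extend(extract_evidence(claim, doc))
    result.verdict = validate_claim(claim, result.evidence)
    return result
```

    Because each stage writes its output to the result object, a fact-checker reviewing the final verdict can inspect which documents, stances, and evidence sentences produced it, which is the transparency property the pipeline approach claims over end-to-end models.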

    Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data

    Automatic summarization has so far focused on datasets of ten to twenty rather short documents, mostly news articles. But automatic systems could in theory analyze hundreds of documents from a range of sources and provide an overview to the interested reader. Such a summary would ideally present the most general issues in a specific topic and allow for more in-depth information on specific aspects within said topic. In this paper, we present a new approach for creating hierarchical summarization corpora by first extracting relevant content from large, heterogeneous document collections using crowdsourcing and, second, ordering the relevant information hierarchically with trained annotators. Our resulting corpus can be used to develop and evaluate hierarchical summarization systems.

    UKP Snopes Corpus

    This corpus is based on the Snopes fact-checking website and provides annotations for training machine learning models for different tasks in the fact-checking process: document retrieval, stance detection, evidence identification, and claim validation. The corpus contains 6,422 validated claims, 16,507 evidence text snippets (annotated with sentence-level evidence), and 14,296 documents with their sources (URLs). Please note: we crawled and provide the data according to the regulations of the German text and data mining policy, and we are allowed to share the corpus only for research purposes. Thus, in order to download the corpus, you need to get in contact with us. If you use the corpus in academic work, please cite our CoNLL paper.

    A Richly Annotated Corpus for Different Tasks in Automated Fact-Checking

    Automated fact-checking based on machine learning is a promising approach to identify false information distributed on the web. In order to achieve satisfactory performance, machine learning methods require a large corpus with reliable annotations for the different tasks in the fact-checking process. Having analyzed existing fact-checking corpora, we found that none of them meets these criteria in full. They are either too small in size, do not provide detailed annotations, or are limited to a single domain. Motivated by this gap, we present a new substantially sized mixed-domain corpus with annotations of good quality for the core fact-checking tasks: document retrieval, evidence extraction, stance detection, and claim validation. To aid future corpus construction, we describe our methodology for corpus creation and annotation, and demonstrate that it results in substantial inter-annotator agreement. As baselines for future research, we perform experiments on our corpus with a number of model architectures that reach high performance in similar problem settings. Finally, to support the development of future models, we provide a detailed error analysis for each of the tasks. Our results show that the realistic, multi-domain setting defined by our data poses new challenges for the existing models, providing opportunities for considerable improvement by future systems.
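    The inter-annotator agreement mentioned above is typically reported as a chance-corrected coefficient. As a minimal sketch, the snippet below computes Cohen's kappa for two annotators labeling stance; the labels and values are invented for illustration, and the paper itself may report a different agreement measure.

```python
from collections import Counter

def cohens_kappa(ann_a, ann_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ann_a)
    observed = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    counts_a, counts_b = Counter(ann_a), Counter(ann_b)
    # Expected agreement if both annotators labeled independently at
    # random according to their own label frequencies.
    expected = sum(counts_a[l] * counts_b[l] for l in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

annotator_a = ["support", "refute", "support", "support", "refute", "support"]
annotator_b = ["support", "refute", "support", "refute", "refute", "support"]
print(round(cohens_kappa(annotator_a, annotator_b), 3))  # 0.667
```

    A kappa of 1.0 means perfect agreement and 0.0 means agreement no better than chance; "substantial" agreement is conventionally placed around 0.6 to 0.8 on this scale.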

    UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification

    The Fact Extraction and VERification (FEVER) shared task was launched to support the development of systems able to verify claims by extracting supporting or refuting facts from raw text. The shared task organizers provide a large-scale dataset for the consecutive steps involved in claim verification, in particular document retrieval, fact extraction, and claim classification. In this paper, we present our claim verification pipeline approach, which, according to the preliminary results, scored third in the shared task out of 23 competing systems. For document retrieval, we implemented a new entity linking approach. In order to rank candidate facts and classify a claim on the basis of several selected facts, we introduce two extensions to the Enhanced LSTM (ESIM).